{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LongRAG example\n",
    "\n",
    "This LlamaPack implements LongRAG based on [this paper](https://arxiv.org/pdf/2406.15319).\n",
    "\n",
    "LongRAG retrieves large tokens at a time, with each retrieval unit being ~6k tokens long, consisting of entire documents or groups of documents. This contrasts the short retrieval units (100 word passages) of traditional RAG. LongRAG is advantageous because results can be achieved using only the top 4-8 retrieval units, and long-context LLMs can better understand the context of the documents because long retrieval units preserve their semantic integrity."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"<Your API Key>\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Usage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below shows the usage of `LongRAGPack` using the `gpt-4o` LLM, which is able to handle long context inputs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.packs.longrag import LongRAGPack\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.core import Settings\n",
    "\n",
    "Settings.llm = OpenAI(\"gpt-4o\")\n",
    "\n",
    "pack = LongRAGPack(data_dir=\"./data\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "Pittsburgh can become a startup hub by leveraging its increasing population of young people, particularly those aged 25 to 29, who are crucial for the startup ecosystem. The city should encourage the youth-driven food boom, support independent restaurants and cafes, and focus on historic preservation to maintain its unique character. Additionally, making Pittsburgh more bicycle and pedestrian-friendly and capitalizing on its first-rate research university, CMU, can further attract talent and foster innovation. CMU should focus on being an even better research university rather than setting up specific innovation programs.\n",
       "\n",
       "There are two types of moderates: intentional moderates and accidental moderates. Intentional moderates deliberately choose positions midway between the extremes of right and left, while accidental moderates make up their own minds about each issue, resulting in a broad range of opinions that average to a moderate position. Intentional moderates' beliefs are acquired in bulk and shift with the median opinion, whereas accidental moderates' beliefs are independently chosen and not necessarily aligned with any ideological group."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from IPython.display import Markdown, display\n",
    "\n",
    "query_str = (\n",
    "    \"How can Pittsburgh become a startup hub, and what are the two types of moderates?\"\n",
    ")\n",
    "res = pack.run(query_str)\n",
    "display(Markdown(str(res)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other parameters include `chunk_size`, `similarity_top_k`, and `small_chunk_size`.\n",
    "- `chunk_size`: To demonstrate how different documents are grouped together, documents are split into nodes of `chunk_size` tokens, then re-grouped based on the relationships between the nodes. Because this does not affect the final answer, it can be disabled by setting `chunk_size` to None. The default size is 4096.\n",
    "- `similarity_top_k`: Retrieves the top k large retrieval units. The default is 8, and based on the paper, the ideal range is 4-8.\n",
    "- `small_chunk_size`: To compare similarities, each large retrieval unit is split into smaller child retrieval units of `small_chunk_size` tokens. The embeddings of these smaller retrieval units are compared to the query embeddings. The top k large parent retrieval units are chosen based on the maximum scores of their smaller child retrieval units. The default size is 512."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "Pittsburgh can become a startup hub by leveraging its increasing population of young people, particularly those aged 25 to 29, who are crucial for the startup ecosystem. The city should encourage the youth-driven food boom, preserve historic buildings, and enhance its bicycle and pedestrian infrastructure to make it more attractive to young talent. Additionally, Carnegie Mellon University (CMU) should focus on being an even better research university to attract ambitious individuals. Although Pittsburgh lacks a significant investor community, the decreasing cost of starting startups and alternative funding sources like Kickstarter and Y Combinator can help mitigate this issue.\n",
       "\n",
       "There are two types of moderates: intentional moderates and accidental moderates. Intentional moderates deliberately choose positions midway between the extremes of right and left, while accidental moderates form their opinions independently on each issue, resulting in a broad range of views that average out to a moderate position."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "pack = LongRAGPack(data_dir=\"./data\", chunk_size=None, similarity_top_k=4)\n",
    "query_str = (\n",
    "    \"How can Pittsburgh become a startup hub, and what are the two types of moderates?\"\n",
    ")\n",
    "res = pack.run(query_str)\n",
    "display(Markdown(str(res)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Vector Storage\n",
    "\n",
    "The vector index can be extracted and be persisted to disk. A `LongRAGPack` can also be constructed given a vector index. Below is an example of persisting the index to disk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import typing as t\n",
    "from llama_index.core import VectorStoreIndex\n",
    "\n",
    "modules = pack.get_modules()\n",
    "index = t.cast(VectorStoreIndex, modules[\"index\"])\n",
    "index.storage_context.persist(persist_dir=\"./paul_graham\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is an example of loading an index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "To transform Pittsburgh into a startup hub, several strategies can be employed. Encouraging a youth-driven food boom, preserving historic buildings, capitalizing on the city's density, making it more bicycle and pedestrian-friendly, and leveraging the presence of Carnegie Mellon University (CMU) are key steps. Additionally, fostering a tolerant and pragmatic culture and gradually building an investor community are crucial.\n",
       "\n",
       "Regarding the two types of moderates, they can be classified as intentional moderates and accidental moderates. Intentional moderates deliberately choose positions midway between extremes, while accidental moderates form their opinions independently on each issue, resulting in a middle-ground stance on average."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from llama_index.core import StorageContext, load_index_from_storage\n",
    "\n",
    "ctx = StorageContext.from_defaults(persist_dir=\"./paul_graham\")\n",
    "index = load_index_from_storage(ctx)\n",
    "pack_from_idx = LongRAGPack(data_dir=\"./data\", index=index)\n",
    "query_str = (\n",
    "    \"How can Pittsburgh become a startup hub, and what are the two types of moderates?\"\n",
    ")\n",
    "res = pack.run(query_str)\n",
    "display(Markdown(str(res)))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama-index-RvIdVF4_-py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
